Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix cgroups cmdline path #456

Merged
merged 8 commits into from
Mar 2, 2024

Conversation

yebo29
Copy link
Contributor

@yebo29 yebo29 commented Feb 24, 2024

Proposed Changes

When replacing a node, I noticed that the cgroups task has an old path. Old Raspbian had /boot/cmdline.txt, but that file has been moved to /boot/firmware/cmdline.txt. Upon updating that path, configuration completed successfully and k3s service started without error. When I originally built the cluster, I made that change manually. Replacing a node and using the playbook exposed this issue.

References

Checklist

  • Create a test for the path: if /boot/cmdline.txt exists and has the content we want, make the change there. If not, move on to /boot/firmware/cmdline.txt
  • Tested locally
  • Ran site.yml playbook
  • N/A (not used in reset.yml) Ran reset.yml playbook
  • Did not add any unnecessary changes
  • Ran pre-commit install at least once before committing
  • 🚀

@yebo29 yebo29 marked this pull request as draft February 24, 2024 22:51
@yebo29
Copy link
Contributor Author

yebo29 commented Feb 24, 2024

Need to add check so that we edit the right file on older Raspbian systems.

@yebo29 yebo29 marked this pull request as ready for review February 25, 2024 02:28
@timothystewart6
Copy link
Contributor

Thank you!

@yebo29
Copy link
Contributor Author

yebo29 commented Feb 26, 2024

@timothystewart6 of course! Although I'm not sure why the molecule test is failing. I tried locally with the file in both locations.

@timothystewart6
Copy link
Contributor

@yebo29 thank you! Not your fault. The ipv6 test is flaky, not sure of the root cause but I will get this merged in.

@pbolduc
Copy link
Contributor

pbolduc commented Feb 27, 2024

This worked for me. I think there is a couple of issues that may want to be considered.

  1. if someone ran an older version of this playbook on a bookworm enviornment, it would create the /boot/cmdline.txt. Even if the version of the playbook is upgraded, because /boot/cmdline.txt was created by old paybook, the updated playbook would still try to change /boot/cmdline.txt.
  2. if someone does a dist upgrade, does the cmdline.txt get moved or copied? is there a chance the old one is still there?

I manually deleted /boot/cmdline.txt before testing. Should the path be set by the distribution detection instead of file existence?

@yebo29
Copy link
Contributor Author

yebo29 commented Feb 27, 2024

This worked for me. I think there is a couple of issues that may want to be considered.

1. if someone ran an older version of this playbook on a bookworm enviornment, it would create the `/boot/cmdline.txt`. Even if the version of the playbook is upgraded, because `/boot/cmdline.txt` was created by old paybook, the updated playbook would still try to change `/boot/cmdline.txt`.

2. if someone does a dist upgrade, does the cmdline.txt get moved or copied? is there a chance the old one is still there?

I manually deleted /boot/cmdline.txt before testing. Should the path be set by the distribution detection instead of file existence?

Those are valid points. An LSB release check would be beneficial here. I think that would resolve both points, since Bookworm will use the new location if it was a new install or an upgrade, I would think. But that would require further digging/research to confirm.

I'm okay with putting this back in draft mode and adding the additional check if we agree on that. I'll let @timothystewart6 provide further input.

EDIT: Typo fix

@timothystewart6
Copy link
Contributor

@yebo29 that sounds good to me, the safer the better. No rush at all to get this merged in!

@yebo29 yebo29 marked this pull request as draft February 27, 2024 04:11
@yebo29
Copy link
Contributor Author

yebo29 commented Feb 27, 2024

Converted back to draft mode. Will push new changes when ready!

@yebo29
Copy link
Contributor Author

yebo29 commented Feb 27, 2024

Added commit, but I still need to test. Due to time, I will test tomorrow.

@pbolduc
Copy link
Contributor

pbolduc commented Feb 27, 2024

I just tested on my Pi cluster and it successfully deployed when a /boot/cmdline.txt exists. The /boot/firmware/cmdline.txt was updated correctly.

@yebo29
Copy link
Contributor Author

yebo29 commented Feb 27, 2024

I just tested on my Pi cluster and it successfully deployed when a /boot/cmdline.txt exists. The /boot/firmware/cmdline.txt was updated correctly.

Thank you for testing! I've had a jam-packed morning so just getting to this. Going to do my own round of testing and then mark as ready when I'm done and if there are no issues.

@yebo29
Copy link
Contributor Author

yebo29 commented Feb 27, 2024

Ah, I also have a lint error. Forgot to run pre-commit. Line too long. I'll look into breaking it into multiple lines.

@yebo29 yebo29 marked this pull request as ready for review February 27, 2024 20:26
@yebo29
Copy link
Contributor Author

yebo29 commented Feb 27, 2024

Pushed new changes. Was able to test locally. Marked as ready for review again.

Idea for another PR and/or issue: Add tags to tasks to speed up testing: ansible-playbook site.yml -i inventory/hosts.ini --tags rpi,k3s_server_post

@timothystewart6 timothystewart6 enabled auto-merge (squash) February 27, 2024 20:29
@yebo29
Copy link
Contributor Author

yebo29 commented Feb 27, 2024

Darn IPv6 molecule test failed again. Anything I can do to help there?

@pbolduc
Copy link
Contributor

pbolduc commented Feb 29, 2024

I just checked a new install on a fresh install of dietpi and they have not moved the cmdline.txt. Diet Pi probably should move it, however, I wonder if the detection should use /boot/firmware/cmdline.txt if it is found, otherwise fall back to /boot/cmdline.txt.

Raspberry Pi OS

pico-k3s@worker-8:~ $ find /boot -name cmdline.txt
/boot/cmdline.txt
/boot/firmware/cmdline.txt
pico-k3s@worker-8:~ $ cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

Diet PI

root@control-1:~# find /boot -name cmdline.txt
/boot/cmdline.txt
root@control-1:~# cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

@yebo29
Copy link
Contributor Author

yebo29 commented Feb 29, 2024

I just checked a new install on a fresh install of dietpi and they have not moved the cmdline.txt. Diet Pi probably should move it, however, I wonder if the detection should use /boot/firmware/cmdline.txt if it is found, otherwise fall back to /boot/cmdline.txt.

Raspberry Pi OS

pico-k3s@worker-8:~ $ find /boot -name cmdline.txt
/boot/cmdline.txt
/boot/firmware/cmdline.txt
pico-k3s@worker-8:~ $ cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

Diet PI

root@control-1:~# find /boot -name cmdline.txt
/boot/cmdline.txt
root@control-1:~# cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

Did you try to run these code changes against that OS and observe what happens?

@pbolduc
Copy link
Contributor

pbolduc commented Feb 29, 2024

Did you try to run these code changes against that OS and observe what happens?

Not yet. I will try this evening (PST) after work.

@pbolduc
Copy link
Contributor

pbolduc commented Feb 29, 2024

I just re-imaged Pi OS to one Pi 4. I can see both /boot/cmdline.txt and /boot/firmware/cmdline.txt. The files are identical,

pico-k3s@worker-1:~ $ cat /boot/cmdline.txt
console=serial0,115200 console=tty1 root=PARTUUID=ff374c2a-02 rootfstype=ext4 fsck.repair=yes rootwait
pico-k3s@worker-1:~ $ cat /boot/firmware/cmdline.txt
console=serial0,115200 console=tty1 root=PARTUUID=ff374c2a-02 rootfstype=ext4 fsck.repair=yes rootwait

So, I am wondering if the existing check by grepping the file for console=|rootfstype is the best approach. I only know ansible enough to read it. I could try taking a stab at a fix. In my head the logic would be

if debian bookworm
  if exists /boot/firmware/cmdline.txt then boot_cmdline_path = /boot/firmware/cmdline.txt
  else if exists /boot/cmdline.txt then boot_cmdline_path = /boot/cmdline.txt

Thoughts?

@yebo29
Copy link
Contributor Author

yebo29 commented Mar 1, 2024

I just re-imaged Pi OS to one Pi 4. I can see both /boot/cmdline.txt and /boot/firmware/cmdline.txt. The files are identical,

pico-k3s@worker-1:~ $ cat /boot/cmdline.txt
console=serial0,115200 console=tty1 root=PARTUUID=ff374c2a-02 rootfstype=ext4 fsck.repair=yes rootwait
pico-k3s@worker-1:~ $ cat /boot/firmware/cmdline.txt
console=serial0,115200 console=tty1 root=PARTUUID=ff374c2a-02 rootfstype=ext4 fsck.repair=yes rootwait

So, I am wondering if the existing check by grepping the file for console=|rootfstype is the best approach. I only know ansible enough to read it. I could try taking a stab at a fix. In my head the logic would be

if debian bookworm
  if exists /boot/firmware/cmdline.txt then boot_cmdline_path = /boot/firmware/cmdline.txt
  else if exists /boot/cmdline.txt then boot_cmdline_path = /boot/cmdline.txt

Thoughts?

Sure, we can flip the logic around. It makes sense to me.

@yebo29
Copy link
Contributor Author

yebo29 commented Mar 1, 2024

Done. I changed the test to stat, flipped the logic around, and updated the following task to suit. I tested locally successfully, and also changed the match for bookworm to nonsense to make sure it skips based on that as well. Feel free to also verify at your earliest convenience, and I thank you for all your help @pbolduc !

@pbolduc
Copy link
Contributor

pbolduc commented Mar 1, 2024

@yebo29 Thank you for actually implementing the changes. I have to purchase Jeff's Ansible for DevOps and actually learn to write ansible. I'll have time tomorrow night to reimage my pi's USB drives to start clean. I will try with latest Raspberry Pi OS and Diet Pi and report my results. (I'm setting up 9 node - 1 control, 8 worker node k3s for testing at home)

@pbolduc
Copy link
Contributor

pbolduc commented Mar 2, 2024

Hit a snag with diet pi and I think this PR should be tested / merged separately. The snag I ran into is that diet pi is a minimal distro. Diet Pi does not install lsb-release by default. Without this package installed, the check ansible_facts.lsb.description|default("") is match(allowed_descriptions | join('|')) never matches. Running ansible 192.168.1.150 -m ansible.builtin.gather_facts --tree diet-pi-facts.txt shows the lsb block is empty.

"ansible_lsb": {},

With my limited ansible experience, I tried to add some tasks to ensure this was installed but failed to get it working as expected. This is what I tried to add near the beginning of roles/raspberrypi/tasks/main.yml. I think it best to create a diet pi "up for grabs" issue and if someone (or me when I have time) in the community would like to help, they can submit a future PR.

# some distro like diet pi do not install lsb-release by default
- name: Install lsb-release
  apt:
    name: lsb-release
    state: present
  register: lsb_release
  when: ansible_facts.lsb.description is not defined

# if lsb-release was installed, re-gather the lsb facts
- name: Gather lsb if lsb_release installed
  setup:
    gather_subset:
      - "lsb"
  when: lsb_release is changed

Even without lsb-release, Diet Pi does have the distribution properties,

        "ansible_distribution": "Debian",
        "ansible_distribution_file_parsed": true,
        "ansible_distribution_file_path": "/etc/os-release",
        "ansible_distribution_file_variety": "Debian",
        "ansible_distribution_major_version": "12",
        "ansible_distribution_minor_version": "5",
        "ansible_distribution_release": "bookworm",
        "ansible_distribution_version": "12.5",

@pbolduc
Copy link
Contributor

pbolduc commented Mar 2, 2024

Added #463 to track diet pi compatibility.

@timothystewart6
Copy link
Contributor

Thank you all!

@timothystewart6 timothystewart6 merged commit 955c6f6 into techno-tim:master Mar 2, 2024
9 checks passed
@yebo29 yebo29 deleted the raspbian-cgroups-fix branch March 3, 2024 03:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants